本文解决了对象识别的问题,给出了一组图像作为输入(例如,多个相机源和视频帧)。基于卷积神经网络(CNN)的框架不会有效地利用这些集合,处理如观察到的模式,而不是捕获基础特征分布,因为它不考虑集合中的图像的方差。为了解决这个问题,我们提出了基于基于CNNS的CNNS作为分类器的NN层,作为分类器的NN层,可以更有效地处理图像,并且可以以端到端的方式训练。图像集由低维输入子空间表示;并且此输入子空间与参考子空间匹配,通过其规范角度的相似性,可解释和易于计算度量。 G-LMSM的关键思想是参考子空间被学习为基层歧管的点,用黎曼随机梯度下降而优化。这种学习是稳定,高效,理论上的接地。我们展示了我们提出的方法在手工形状识别,面部识别和面部情感识别方面的有效性。
translated by 谷歌翻译
深度学习(DL)技术被回归问题所接受。最近在该领域发表的论文数量越来越多,包括调查和评论,表明,由于效率和具有高维数据的系统的良好精度,深层回归引起了社区的关注。但是,许多DL方法具有复杂的结构,这些结构对人类用户不易透明。访问这些模型的可解释性是解决敏感领域问题(例如网络安全系统,医疗,金融监视和工业过程)的重要因素。模糊逻辑系统(FLS)是可解释的模型,在文献中众所周知,能够通过具有成员资格学位的语言术语对复杂系统使用非线性表示,模仿了人类的思想。在可解释的人工智能的气氛中,有必要考虑开发智能模型的准确性和可解释性之间的权衡。本文旨在调查结合DL和FL的现有方法的最新方法,即深度模糊系统,以解决回归问题,配置当前在文献中尚不充分探索的主题,因此应进行全面调查。
translated by 谷歌翻译
最近,由于许多用例的性能要求严格的性能要求,基于意图的管理正在受到电信网络的良好关注。文献上的几种方法采用电信域中的传统方法来满足KPI的意图,可以将其定义为封闭环。但是,这些方法考虑了每个闭环相互独立的环路,从而降低了组合的闭环性能。同样,当需要许多闭环时,这些方法不容易扩展。在许多领域,多机构增强学习(MARL)技术在许多领域都表现出了巨大的希望,在许多领域中,传统的闭环控制效果不足,通常用于循环之间的复杂协调和冲突管理。在这项工作中,我们提出了一种基于MARL的方法,以实现基于意图的管理,而无需基础系统模型。此外,当存在相互矛盾的意图时,MARL代理可以通过优先考虑重要的KPI来暗中激励循环,而无需人工互动。已经在网络模拟器上进行了实验,以优化三种服务的KPI,我们观察到拟议的系统的性能良好,并且在资源不足或资源稀缺时能够实现所有现有的意图。
translated by 谷歌翻译
通常,基于生物谱系的控制系统可能不依赖于各个预期行为或合作适当运行。相反,这种系统应该了解未经授权的访问尝试的恶意程序。文献中提供的一些作品建议通过步态识别方法来解决问题。这些方法旨在通过内在的可察觉功能来识别人类,尽管穿着衣服或配件。虽然该问题表示相对长时间的挑战,但是为处理问题的大多数技术存在与特征提取和低分类率相关的几个缺点,以及其他问题。然而,最近的深度学习方法是一种强大的一组工具,可以处理几乎任何图像和计算机视觉相关问题,为步态识别提供最重要的结果。因此,这项工作提供了通过步态认可的关于生物识别检测的最近作品的调查汇编,重点是深入学习方法,强调他们的益处,暴露出弱点。此外,它还呈现用于解决相关约束的数据集,方法和体系结构的分类和表征描述。
translated by 谷歌翻译
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computationally efficiency goals, and it has been recently used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2- armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential that designs using a GI approach to allocate participants have to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
translated by 谷歌翻译
Quadruped robots are currently used in industrial robotics as mechanical aid to automate several routine tasks. However, presently, the usage of such a robot in a domestic setting is still very much a part of the research. This paper discusses the understanding and virtual simulation of such a robot capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expression on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate through various terrains and detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish the framework of simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli using motor or audio response. The emotion detection from the speech was not as performant as ERANNs or Zeta Policy learning, still managing an accuracy of 63.5%. The video emotion detection system produced results that are almost at par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm was extremely rapid to learn, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.
translated by 谷歌翻译
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and point's permutation, which generates PCDs that are geometrically consistent and completed properly. Experiments on a wide range of partial PCD show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
translated by 谷歌翻译
When robots learn reward functions using high capacity models that take raw state directly as input, they need to both learn a representation for what matters in the task -- the task ``features" -- as well as how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data, which fails to generalize to new settings. Instead, our ultimate goal is to enable robots to identify and isolate the causal features that people actually care about and use when they represent states and behavior. Our idea is that we can tune into this representation by asking users what behaviors they consider similar: behaviors will be similar if the features that matter are similar, even if low-level behavior is different; conversely, behaviors will be different if even one of the features that matter differs. This, in turn, is what enables the robot to disambiguate between what needs to go into the representation versus what is spurious, as well as what aspects of behavior can be compressed together versus not. The notion of learning representations based on similarity has a nice parallel in contrastive learning, a self-supervised representation learning technique that maps visually similar data points to similar embeddings, where similarity is defined by a designer through data augmentation heuristics. By contrast, in order to learn the representations that people use, so we can learn their preferences and objectives, we use their definition of similarity. In simulation as well as in a user study, we show that learning through such similarity queries leads to representations that, while far from perfect, are indeed more generalizable than self-supervised and task-input alternatives.
translated by 谷歌翻译
and widely used information measurement metric, particularly popularized for SSVEP- based Brain-Computer (BCI) interfaces. By combining speed and accuracy into a single-valued parameter, this metric aids in the evaluation and comparison of various target identification algorithms across different BCI communities. To accurately depict performance and inspire an end-to-end design for futuristic BCI designs, a more thorough examination and definition of ITR is therefore required. We model the symbiotic communication medium, hosted by the retinogeniculate visual pathway, as a discrete memoryless channel and use the modified capacity expressions to redefine the ITR. We use graph theory to characterize the relationship between the asymmetry of the transition statistics and the ITR gain with the new definition, leading to potential bounds on data rate performance. On two well-known SSVEP datasets, we compared two cutting-edge target identification methods. Results indicate that the induced DM channel asymmetry has a greater impact on the actual perceived ITR than the change in input distribution. Moreover, it is demonstrated that the ITR gain under the new definition is inversely correlated with the asymmetry in the channel transition statistics. Individual input customizations are further shown to yield perceived ITR performance improvements. An algorithm is proposed to find the capacity of binary classification and further discussions are given to extend such results to ensemble techniques.We anticipate that the results of our study will contribute to the characterization of the highly dynamic BCI channel capacities, performance thresholds, and improved BCI stimulus designs for a tighter symbiosis between the human brain and computer systems while enhancing the efficiency of the underlying communication resources.
translated by 谷歌翻译
A step-search sequential quadratic programming method is proposed for solving nonlinear equality constrained stochastic optimization problems. It is assumed that constraint function values and derivatives are available, but only stochastic approximations of the objective function and its associated derivatives can be computed via inexact probabilistic zeroth- and first-order oracles. Under reasonable assumptions, a high-probability bound on the iteration complexity of the algorithm to approximate first-order stationarity is derived. Numerical results on standard nonlinear optimization test problems illustrate the advantages and limitations of our proposed method.
translated by 谷歌翻译